Smart Paper Notes

Flexible Bayesian Nonlinear Model Configuration

Aliaksandr Hubin, Geir Storvik, Florian Frommlet

Categories: Machine Learning (Statistics) | Machine Learning

2020-03-05

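Regression models are used in a wide range of applications and give researchers from many fields a powerful scientific tool. Linear or simple parametric models are often insufficient to describe complex relationships between input variables and a response. Flexible approaches such as neural networks can describe these relationships better, but they yield less interpretable models and risk overfitting. Alternatively, specific parametric nonlinear functions can be used, but specifying such functions is generally complicated. In this paper, we introduce a flexible approach for constructing highly flexible nonlinear parametric regression models. Nonlinear features are generated hierarchically, similarly to deep learning, but with additional flexibility in the types of features that can be considered. This flexibility, combined with variable selection, allows us to find a small set of important features and thereby obtain more interpretable models. Within the space of possible functions, a Bayesian approach is considered that introduces priors on functions based on their complexity. A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference and to estimate posterior probabilities for model averaging. In a variety of applications, we illustrate how our approach is used to obtain meaningful nonlinear models. In addition, we compare its predictive performance with that of several machine learning algorithms.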
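The idea of hierarchically generated features with a complexity-based prior can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Feature` class, the node-counting complexity measure, and the penalty base `a` are hypothetical choices, and the paper's actual feature grammar and prior may differ.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature: a raw input x[index], or g(sum_j w_j * child_j(x))."""
    index: int = -1
    g: Optional[Callable[[float], float]] = None
    children: Tuple["Feature", ...] = ()
    weights: Tuple[float, ...] = ()

    def evaluate(self, x: Sequence[float]) -> float:
        # A leaf simply reads one input variable.
        if not self.children:
            return x[self.index]
        # An inner node applies a nonlinearity to a weighted sum of its children.
        s = sum(w * c.evaluate(x) for w, c in zip(self.weights, self.children))
        return self.g(s)

    def complexity(self) -> int:
        # Assumed measure: total number of nodes in the feature tree.
        if not self.children:
            return 1
        return 1 + sum(c.complexity() for c in self.children)


def log_prior(features: Sequence[Feature], a: float = 0.5) -> float:
    """Unnormalised log-prior of a model; each feature contributes
    complexity * log(a) with 0 < a < 1, so deeper features are penalised."""
    return sum(f.complexity() * math.log(a) for f in features)


# Build a depth-2 feature exp(0.5 * tanh(x0 - x1)) from two raw inputs.
x0, x1 = Feature(index=0), Feature(index=1)
inner = Feature(g=math.tanh, children=(x0, x1), weights=(1.0, -1.0))
nested = Feature(g=math.exp, children=(inner,), weights=(0.5,))
print(nested.complexity(), log_prior([nested]), log_prior([x0]))
```

Under this assumed prior, a model containing only the raw input `x0` receives a higher prior weight than one containing `nested`, which is the kind of preference for simple features the abstract describes.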

translated by Google Translate

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou

Category: Computer Vision

2022-12-22

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general enough to describe various scene contents, and yet informative enough to disentangle objects and background. Moreover, it serves as an intuitive user control for scene editing. Based on such a prior, the proposed model spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with global-local discrimination. Our model obtains the generation fidelity and editing flexibility of individual objects while being able to efficiently compose objects and the background into a complete scene. We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset. Project page: https://snap-research.github.io/discoscene/

Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation

Jichao Zhang, Aliaksandr Siarohin, Yahui Liu, Hao Tang, Nicu Sebe, Wei Wang

Category: Computer Vision

2022-08-26

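3D-aware GANs based on Generative Neural Radiance Fields (GNeRF) have achieved impressively high-quality image generation while preserving strong 3D consistency. The most notable achievements have been made in the face generation domain. However, most of these models focus on improving view consistency and neglect disentanglement, so they cannot provide high-quality semantic/attribute control over generation. To this end, we introduce a conditional GNeRF model that takes specific attribute labels as input in order to improve the controllability and disentangling ability of 3D-aware generative models. We use a pre-trained 3D-aware model as the basis and integrate a dual-branch attribute-editing module (DAEM), which uses attribute labels to provide control over generation. Moreover, we propose TRIOT (TRaining as Init, and Optimizing for Tuning), a method that optimizes the latent vector to further improve the precision of attribute editing. Extensive experiments on the widely used FFHQ dataset show that our model yields high-quality edits with better view consistency while preserving non-target regions. The code is available at https://github.com/zhangqianhui/tt-gnerf.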

3D-Aware Semantic-Guided Generative Model for Human Synthesis

Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang

Category: Computer Vision

2021-12-02

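Generative Neural Radiance Field (GNeRF) models, which extract implicit 3D representations from 2D images, have recently been shown to produce realistic images of rigid objects such as human faces or cars. However, they usually struggle to generate high-quality images of non-rigid objects such as the human body, which is of great interest for many computer graphics applications. This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SAGGAN) for human image synthesis, which integrates a GNeRF and a texture generator. The former learns an implicit 3D representation of the human body and outputs a set of 2D semantic segmentation masks. The latter transforms these semantic masks into a real image, adding realistic texture to the human appearance. Without requiring additional 3D information, our model can learn 3D human representations with photo-realistic, controllable generation. Our experiments on the DeepFashion dataset show that 3D-SAGGAN significantly outperforms recent baselines.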

First Order Motion Model for Image Animation

Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

Category:

2020-02-29

Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video. Our framework addresses this problem without using any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), our method can be applied to any object of this class. To achieve this, we decouple appearance and motion information using a self-supervised formulation. To support complex motions, we use a representation consisting of a set of learned keypoints along with their local affine transformations. A generator network models occlusions arising during target motions and combines the appearance extracted from the source image and the motion derived from the driving video. Our framework scores best on diverse benchmarks and on a variety of object categories. Our source code is publicly available.
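As a rough illustration of what "a keypoint along with its local affine transformation" can mean, the sketch below maps a pixel location near a driving-frame keypoint to the corresponding source-frame location using a 2x2 matrix. The function name, argument names, and the specific form of the mapping are hypothetical and are not taken from the authors' code. In the paper the keypoints and matrices are learned by a network, whereas here they are supplied by hand.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def local_affine_motion(
    z: Point,
    kp_driving: Point,
    kp_source: Point,
    jacobian: Sequence[Sequence[float]],
) -> Point:
    """Map location z (driving frame) to the source frame using a
    first-order (affine) approximation around one keypoint:

        T(z) ~= kp_source + J @ (z - kp_driving)

    `jacobian` is a 2x2 matrix given as nested sequences (assumed input).
    """
    dx = z[0] - kp_driving[0]
    dy = z[1] - kp_driving[1]
    return (
        kp_source[0] + jacobian[0][0] * dx + jacobian[0][1] * dy,
        kp_source[1] + jacobian[1][0] * dx + jacobian[1][1] * dy,
    )


# With an identity matrix the keypoint neighbourhood is only translated.
identity = ((1.0, 0.0), (0.0, 1.0))
print(local_affine_motion((1.5, 1.0), (1.0, 1.0), (4.0, 3.0), identity))
```

A full model would combine several such per-keypoint mappings into a dense motion field. That combination step is left out here because the abstract does not describe it.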
